Towards Sustainable Precision: Machine Learning for Laser Micromachining Optimization
Correas-Naranjo, Luis, Camacho-Sánchez, Miguel, Launet, Laëtitia, Zuric, Milena, Naranjo, Valery
In the pursuit of sustainable manufacturing, ultra-short pulse laser micromachining stands out as a promising solution, offering high-precision, high-quality laser processing. However, unlocking the full potential of ultra-short pulse lasers requires an optimized monitoring system capable of early detection of defective workpieces, regardless of the preprocessing technique employed. While advances in machine learning can help predict process quality features, the complexity of monitoring data necessitates reducing both model size and data dimensionality to enable real-time analysis. To address these challenges, this paper introduces a machine learning framework designed to enhance surface quality assessment across diverse preprocessing techniques. To facilitate real-time laser processing monitoring, our solution aims to optimize the computational requirements of the machine learning model. Experimental results show that the proposed model not only surpasses the generalizability achieved by previous works across diverse preprocessing techniques but also significantly reduces the computational requirements for training. Through these advancements, we aim to establish the baseline for a more sustainable manufacturing process.
- North America > United States (0.04)
- Europe > Spain > Valencian Community > Valencia Province > Valencia (0.04)
- Europe > Germany > North Rhine-Westphalia > Cologne Region > Aachen (0.04)
ClearVision: Leveraging CycleGAN and SigLIP-2 for Robust All-Weather Classification in Traffic Camera Imagery
Sivaraman, Anush Lakshman, Adu-Gyamfi, Kojo, Shihab, Ibne Farabi, Sharma, Anuj
Adverse weather conditions challenge safe transportation, necessitating robust real-time weather detection from traffic camera imagery. We propose a novel framework combining CycleGAN-based domain adaptation with efficient contrastive learning to enhance weather classification, particularly in low-light nighttime conditions. Our approach leverages the lightweight SigLIP-2 model, which employs pairwise sigmoid loss to reduce computational demands, integrated with CycleGAN to transform nighttime images into day-like representations while preserving weather cues. Evaluated on an Iowa Department of Transportation dataset, the baseline EVA-02 model with CLIP achieves a per-class overall accuracy of 96.55\% across three weather conditions (No Precipitation, Rain, Snow) and a day/night overall accuracy of 96.55\%, but shows a significant day-night gap (97.21\% day vs.\ 63.40\% night). With CycleGAN, EVA-02 improves to 97.01\% per-class accuracy and 96.85\% day/night accuracy, boosting nighttime performance to 82.45\%. Our Vision-SigLIP-2 + Text-SigLIP-2 + CycleGAN + Contrastive configuration excels in nighttime scenarios, achieving the highest nighttime accuracy of 85.90\%, with 94.00\% per-class accuracy and 93.35\% day/night accuracy. This model reduces training time by 89\% (from 6 hours to 40 minutes) and inference time by 80\% (from 15 seconds to 3 seconds) compared to EVA-02. By narrowing the day-night performance gap from 33.81 to 8.90 percentage points, our framework provides a scalable, efficient solution for all-weather classification using existing camera infrastructure.
- Government (0.75)
- Transportation (0.55)
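The pairwise sigmoid objective that the abstract credits for SigLIP-2's efficiency treats every image-text pair in a batch as an independent binary classification, avoiding CLIP's batch-wide softmax. A minimal NumPy sketch of that loss follows; the temperature and bias defaults are illustrative assumptions, not values from this paper:

```python
import numpy as np

def siglip_pairwise_loss(img_emb, txt_emb, t=10.0, b=-10.0):
    """Sketch of the SigLIP-style pairwise sigmoid loss.

    img_emb, txt_emb: L2-normalized (N, D) arrays of matched pairs
    (row i of each is a positive pair; all cross rows are negatives).
    t (temperature) and b (bias) are illustrative defaults.
    """
    logits = t * img_emb @ txt_emb.T + b        # (N, N) similarity logits
    n = logits.shape[0]
    labels = 2.0 * np.eye(n) - 1.0              # +1 on diagonal, -1 elsewhere
    # -log sigmoid(z * logit) = log(1 + exp(-z * logit)), summed over all
    # N^2 pairs and normalized by batch size N.
    return np.log1p(np.exp(-labels * logits)).sum() / n
```

Because each pair contributes an independent sigmoid term, the loss needs no global normalization across the batch, which is the source of the reduced computational demand mentioned above.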
Interpretable Physics Reasoning and Performance Taxonomy in Vision-Language Models
Pawar, Pranav, Shah, Kavish, Bhalani, Akshat, Kasat, Komal, Mittal, Dev, Gala, Hadi, Patil, Deepali, Raichada, Nikita, Deshmukh, Monali
In recent years, VLMs have captured the imagination of the Artificial Intelligence (AI) community, demonstrating an impressive ability to interpret, reason about, and generate content spanning both text and images. From answering questions about visual scenes to engaging in multi-modal dialogue, models such as Flamingo [1], PaLI [25], and BLIP-2 [14] are redefining the frontier of vision intelligence. Yet, as these models widen their application capabilities, a fundamental question emerges: can they truly reason, or are they sophisticated pattern matchers? To explore this question, we turn to the domain of physics, a field that serves as a universal benchmark for human logical reasoning. Physics problems are an ideal testbed for VLMs, as they are multi-modal, combining textual descriptions, mathematical equations, and often clarifying diagrams. A model that can successfully solve these problems must not only understand language and images but also grasp the underlying relationships and principles that govern the physical realm. The challenge, until now, has been the lack of accessible tools for this kind of evaluation. Existing benchmarks for scientific reasoning, such as ARC [7] and ScienceQA [17], are often limited to basic text-only question sets, while those that incorporate visual elements, like MathVista [18], frequently depend on complex physics simulators that are computationally expensive for many researchers to deploy, thereby restricting reproducibility.
- Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
- Information Technology > Artificial Intelligence > Cognitive Science > Problem Solving (0.89)
- Information Technology > Artificial Intelligence > Natural Language > Large Language Model (0.73)
- Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.31)
rECGnition_v2.0: Self-Attentive Canonical Fusion of ECG and Patient Data using deep learning for effective Cardiac Diagnostics
Srivastava, Shreya, Kumar, Durgesh, Jiwari, Ram, Seth, Sandeep, Sharma, Deepak
The variability in ECG readings caused by individual patient characteristics has posed a considerable challenge to adopting automated ECG analysis in clinical settings. To address this, a novel feature fusion technique termed SACC (Self-Attentive Canonical Correlation) was proposed. This technique is combined with a DPN (Dual Pathway Network) and depth-wise separable convolution to create a robust, interpretable, and fast end-to-end arrhythmia classification model named rECGnition_v2.0 (robust ECG abnormality detection). This study uses the MIT-BIH, INCARTDB, and EDB datasets to evaluate the efficiency of rECGnition_v2.0 across various classes of arrhythmias. To investigate the influence of the constituent model components, various ablation studies were performed: simple concatenation, CCA, and the proposed SACC were compared, and the importance of global and local ECG features was tested within the DPN of rECGnition_v2.0. The model was also benchmarked against state-of-the-art CNN models on overall accuracy versus model parameters, FLOPs, memory requirements, and prediction time. Furthermore, the inner workings of the model were interpreted by comparing the activation locations in the ECG before and after the SACC layer. On the MIT-BIH Arrhythmia dataset, rECGnition_v2.0 achieved a remarkable accuracy of 98.07% and an F1-score of 98.05% for classifying ten distinct classes of arrhythmia with just 82.7M FLOPs per sample, surpassing the performance metrics of current state-of-the-art (SOTA) models. Similarly, on the INCARTDB and EDB datasets, excellent F1-scores of 98.01% and 96.21%, respectively, were achieved for AAMI classification. The compact architectural footprint of rECGnition_v2.0, characterized by fewer trainable parameters and diminished computational demands, offers several advantages, including interpretability and scalability.
- Asia > India > Uttarakhand > Roorkee (0.05)
- North America > United States > New Mexico > Bernalillo County > Albuquerque (0.04)
- Europe > Switzerland > Basel-City > Basel (0.04)
- (2 more...)
- Research Report > New Finding (0.67)
- Research Report > Experimental Study (0.46)
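The compact footprint attributed above to depth-wise separable convolution follows from a simple parameter count: a standard k×k convolution couples every input channel to every output channel, while the separable variant splits this into a per-channel k×k depthwise pass plus a 1×1 pointwise mix. The sketch below illustrates this generic property of the layer type; the channel counts are illustrative, not figures from the paper:

```python
def conv_params(k, c_in, c_out):
    """Parameter count of a standard k x k convolution (bias omitted)."""
    return k * k * c_in * c_out

def depthwise_separable_params(k, c_in, c_out):
    """Depthwise separable: one k x k filter per input channel (depthwise)
    followed by a 1 x 1 pointwise convolution mixing channels."""
    return k * k * c_in + c_in * c_out

# Example: a 3x3 layer with 64 input and 128 output channels.
standard = conv_params(3, 64, 128)                # 73,728 parameters
separable = depthwise_separable_params(3, 64, 128)  # 8,768 parameters
```

For this configuration the split yields roughly an 8x reduction in parameters (and a similar reduction in FLOPs), which is how such models stay small without shrinking channel width.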
Entropy Adaptive Decoding: Dynamic Model Switching for Efficient Inference
We present Entropy Adaptive Decoding (EAD), a novel approach for efficient language model inference that dynamically switches between different-sized models based on prediction uncertainty. By monitoring rolling entropy in model logit distributions, our method identifies text regions where a smaller model suffices and switches to a larger model only when prediction uncertainty exceeds a threshold. Unlike speculative decoding approaches that maintain perfect output fidelity through verification, EAD accepts controlled output divergence in exchange for computational efficiency. Our experiments on the MATH benchmark demonstrate remarkable efficiency gains across different model families. Using the LLaMA family, we maintain 96.7\% of the 11B model's performance (50.4\% vs 52.1\%) while using it for only 43\% of tokens, decreasing computational cost by 41.5\%. These gains become more pronounced with larger size differentials in the Qwen family, where we achieve 92.9\% of the 14B model's performance (74.3\% vs 80.0\%) while using it for just 25\% of tokens, decreasing computational cost by 67\%. The consistency of these results across model pairs suggests that language model computation can be significantly optimized by selectively deploying model capacity based on local generation complexity. Our findings indicate that current approaches to model inference may be unnecessarily conservative in their pursuit of perfect output fidelity, and that accepting minor performance trade-offs can enable dramatic reductions in computational costs.
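The switching rule described above, monitoring rolling entropy and escalating to the larger model only when it crosses a threshold, can be sketched in a few lines. This is a minimal illustration under assumed parameters (window size, threshold, model labels), not the paper's implementation:

```python
import math

def entropy(probs):
    """Shannon entropy (in nats) of one next-token distribution."""
    return -sum(p * math.log(p) for p in probs if p > 0)

def choose_model(recent_entropies, threshold, window=4):
    """Return which model should generate the next token.

    The rolling mean of the last `window` per-token entropies
    approximates local prediction uncertainty: the small model
    suffices in low-entropy regions, and the large model is used
    only when the mean exceeds `threshold`. Both `window` and
    `threshold` here are illustrative assumptions.
    """
    recent = recent_entropies[-window:]
    rolling = sum(recent) / len(recent)
    return "large" if rolling > threshold else "small"
```

In a full decoding loop, `recent_entropies` would be fed from the small model's logit distributions at each step, so escalating to the large model costs an extra forward pass only when the rule actually triggers.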
GestLLM: Advanced Hand Gesture Interpretation via Large Language Models for Human-Robot Interaction
Kobzarev, Oleg, Lykov, Artem, Tsetserukou, Dzmitry
This paper introduces GestLLM, an advanced system for human-robot interaction that enables intuitive robot control through hand gestures. Unlike conventional systems, which rely on a limited set of predefined gestures, GestLLM leverages large language models and feature extraction via MediaPipe to interpret a diverse range of gestures. This integration addresses key limitations in existing systems, such as restricted gesture flexibility and the inability to recognize complex or unconventional gestures commonly used in human communication. By combining state-of-the-art feature extraction and language model capabilities, GestLLM achieves performance comparable to leading vision-language models while supporting gestures underrepresented in traditional datasets. For example, this includes gestures from popular culture, such as the ``Vulcan salute'' from Star Trek, without additional pretraining or prompt engineering. This flexibility enhances the naturalness and inclusivity of robot control, making interactions more intuitive and user-friendly. GestLLM provides a significant step forward in gesture-based interaction, enabling robots to understand and respond to a wide variety of hand gestures effectively. This paper outlines its design, implementation, and evaluation, demonstrating its potential applications in advanced human-robot collaboration, assistive robotics, and interactive entertainment.
- Asia > Middle East > UAE > Abu Dhabi Emirate > Abu Dhabi (0.14)
- North America > United States > Colorado > Boulder County > Boulder (0.04)
A Comprehensive Survey of Small Language Models in the Era of Large Language Models: Techniques, Enhancements, Applications, Collaboration with LLMs, and Trustworthiness
Wang, Fali, Zhang, Zhiwei, Zhang, Xianren, Wu, Zongyu, Mo, Tzuhao, Lu, Qiuhao, Wang, Wanjing, Li, Rui, Xu, Junjie, Tang, Xianfeng, He, Qi, Ma, Yao, Huang, Ming, Wang, Suhang
Large language models (LLMs) have demonstrated emergent abilities in text generation, question answering, and reasoning, facilitating various tasks and domains. Despite this proficiency, LLMs like PaLM 540B and Llama-3.1 405B face limitations due to large parameter sizes and computational demands, often requiring cloud API use, which raises privacy concerns, limits real-time applications on edge devices, and increases fine-tuning costs. Additionally, LLMs often underperform in specialized domains such as healthcare and law due to insufficient domain-specific knowledge, necessitating specialized models. Therefore, Small Language Models (SLMs) are increasingly favored for their low inference latency, cost-effectiveness, efficient development, and easy customization and adaptability. These models are particularly well-suited for resource-limited environments and domain knowledge acquisition, addressing LLMs' challenges and proving ideal for applications that require localized data handling for privacy, minimal inference latency for efficiency, and domain knowledge acquisition through lightweight fine-tuning. The rising demand for SLMs has spurred extensive research and development. However, a comprehensive survey investigating issues related to the definition, acquisition, application, enhancement, and reliability of SLMs remains lacking, prompting us to conduct a detailed survey on these topics. The definition of SLMs varies widely; thus, to standardize it, we propose defining SLMs by their capability to perform specialized tasks and suitability for resource-constrained settings, setting boundaries based on the minimal size for emergent abilities and the maximum size sustainable under resource constraints. For other aspects, we provide a taxonomy of relevant models/methods and develop general frameworks for each category to enhance and utilize SLMs effectively.
- North America > United States (1.00)
- Asia (1.00)
- Europe (0.92)
- Research Report > New Finding (1.00)
- Overview (1.00)
- Research Report > Experimental Study (0.67)
- Law (1.00)
- Information Technology > Security & Privacy (1.00)
- Health & Medicine (1.00)
- (2 more...)
A Comprehensive Survey and Guide to Multimodal Large Language Models in Vision-Language Tasks
Liang, Chia Xin, Tian, Pu, Yin, Caitlyn Heqi, Yua, Yao, An-Hou, Wei, Ming, Li, Wang, Tianyang, Bi, Ziqian, Liu, Ming
This survey and application guide to multimodal large language models (MLLMs) explores the rapidly developing field of MLLMs, examining their architectures, applications, and impact on AI and generative models. Starting with foundational concepts, we delve into how MLLMs integrate various data types, including text, images, video, and audio, to enable complex AI systems for cross-modal understanding and generation. The guide covers essential topics such as training methods, architectural components, and practical applications in various fields, from visual storytelling to enhanced accessibility. Through detailed case studies and technical analysis, it examines prominent MLLM implementations while addressing key challenges in scalability, robustness, and cross-modal learning. Concluding with a discussion of ethical considerations, responsible AI development, and future directions, this authoritative resource provides both theoretical frameworks and practical insights. It offers a balanced perspective on the opportunities and challenges in the development and deployment of MLLMs, and is highly valuable for researchers, practitioners, and students interested in the intersection of natural language processing and computer vision.
- North America > United States > Wisconsin > Dane County > Madison (0.04)
- North America > United States > Indiana (0.04)
- Europe > United Kingdom > England > Greater London > London (0.04)
- (5 more...)
- Workflow (1.00)
- Research Report > Promising Solution (1.00)
- Overview (1.00)
- (2 more...)
- Media > Film (1.00)
- Leisure & Entertainment (1.00)
- Information Technology > Services (1.00)
- (9 more...)
- Information Technology > Artificial Intelligence > Representation & Reasoning > Personal Assistant Systems (1.00)
- Information Technology > Artificial Intelligence > Representation & Reasoning > Expert Systems (1.00)
- Information Technology > Artificial Intelligence > Natural Language > Text Processing (1.00)
- (8 more...)